Optimal combinations of imperfect objects 
We consider how to make the best use of imperfect objects, such as defective
analog and digital components. We show that perfect, or near-perfect, devices
can be constructed by taking combinations of such defects. Any remaining
objects can be recycled efficiently. In addition to its practical
applications, our 'defect combination problem' provides a novel
generalization of classical optimization problems.
Imperfection is an integral part of Nature, but it cannot always be
tolerated. High-technology devices, for example, must be precise and
dependable. The design of such dependable devices is the domain of
fault-tolerant computing, where the goal is to optimize reliability,
availability or efficiency of redundant systems [1]. Such redundant systems
are typically built from devices which are initially defect-free and hence
pass the quality check, but may later develop faults.

A much less studied problem, but one of significant economic and ecological
importance, is what to do with a component which is already known to be
defective. Components with minor defects are sometimes acceptable for
low-end devices [2]. The Teramac, a massively parallel computer, was built
from partially defective conventional components; it uses an adaptive wiring
scheme between the components in order to avoid the defects, and the wires
themselves can be defective [3]. More typically, however, a component that
is known to be defective is considered 'useless' and is hence wasted.

Here we address this wastage issue by relating it to a novel optimization
problem: given a set of $N$ imperfect components, find a combination, or
subset, that optimizes the average error (analog components), or the number
of working transformations (digital components). We employ methods from
statistical physics to show that perfect, or near-perfect, devices can indeed
be constructed, and remaining objects can be recycled efficiently with
(almost) zero net wastage. Note, however, that combining simple analog
devices such as thermometers is not attractive, since it is usually much
easier and cheaper to subtract the errors from the outputs. But such active
error-correction may not be practical in more complex systems, particularly
for next-generation technologies in the ultrasmall nano/micro regime.
Nanoscale devices such as Coulomb-blockade transistors may enable us to push
back the limits of Moore's law (see Ref. [4] for a review). However, the
accuracy of the current produced at a given analog voltage will depend
sensitively on the reliability of the nanostructure's fabrication.
Similarly, the discrete optical transitions in semiconductor quantum dots [5]
can provide useful digital components for nanoscale classical computing [6].
However, digital switching can only occur if the energy levels coincide with
the external light frequency. The accuracy of these energy levels also
depends on the precision of fabrication. However, even in self-assembled
quantum dot structures, such as the groundbreaking virus-controlled
self-assembly scheme of Ref. [7] where quantum dots are mass-produced, no two
individual dots will ever be identical: each will contain an inherent,
time-independent systematic defect as compared to the intended design. Yet it
would be highly undesirable to discard such nanostructures given the
potential applications of such 'bio-nano' structures.

Consider an analog device such as a nanoscale transistor, registering a
current $A + a$ given a particular applied voltage, with $A$ being the actual
value and $a$ being the systematic error [8]. Suppose fabrication has
produced a batch of $N$ imprecise devices whose errors $a_i$
($i = 1, 2, \ldots, N$) were created when the objects were built and remain
constant; this amounts to drawing them from a known distribution $P(a)$. For
simplicity, we suppose that $P(a)$ is Gaussian with average $\mu$ and
variance $\sigma^2 = 1$ [9]. The most precise component has an error of
order $1/N$. What should one do with the others? Generally speaking, one
could combine them such that their defects compensate. Computing the average
of all components leads to an error of order $\mu \pm 1/\sqrt{N}$, which
vanishes very slowly for large $N$, and even then only if $\mu = 0$.
Nevertheless this method has been used in many contexts throughout history,
for example by sailors who often took several clocks on board ship [10].
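
The contrast between the best single component (error of order $1/N$ for unbiased Gaussian defects) and the plain average over all components (error of order $1/\sqrt{N}$) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: $\mu = 0$, unit variance, a fixed random seed, and a hypothetical function name `compare_single_vs_average` that does not come from the original work.

```python
import math
import random


def compare_single_vs_average(n_components, n_batches, seed=0):
    """Estimate the mean error of (a) the best single component and
    (b) the plain average of all components, for mu = 0 and unit variance."""
    rng = random.Random(seed)  # fixed seed so that the run is reproducible
    best_single_total = 0.0
    plain_average_total = 0.0
    for _ in range(n_batches):
        # One fabricated batch: N systematic errors drawn from P(a).
        errors = [rng.gauss(0.0, 1.0) for _ in range(n_components)]
        best_single_total += min(abs(a) for a in errors)
        plain_average_total += abs(sum(errors) / n_components)
    return best_single_total / n_batches, plain_average_total / n_batches


if __name__ == "__main__":
    for n in (10, 100, 1000):
        best, average = compare_single_vs_average(n, n_batches=1000)
        # Last column: exact mean of |average| for unbiased Gaussian errors.
        print(n, best, average, math.sqrt(2.0 / (math.pi * n)))
```

For unbiased unit-variance errors the mean absolute value of the plain average is exactly $\sqrt{2/(\pi N)}$, which is about 0.08 for $N = 100$, so the third and fourth printed columns should agree up to sampling noise.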

The optimal combination is actually obtained by taking a well-chosen subset
of the $N$ components, i.e. a subset containing $M \le N$ devices whose
errors compensate best. The problem therefore consists of selecting some of
the numbers $a_i$ such that the absolute value of their average is minimized:
hence the interesting quantity is

$$\epsilon = \min_{\{n_i\}} \frac{\left|\sum_{i=1}^{N} n_i a_i\right|}{M}. \qquad (1)$$

Here $n_i \in \{0, 1\}$ selects whether device $i$ is used or not, while
$M = \sum_{i=1}^{N} n_i$ is the total number of devices used. Without
division by $M$, this problem, which we call the defect combination problem
(DCP), would be similar to the subset sum problem or number partitioning
problem (NPP) [11]. Both problems are equivalent and known to be
NP-complete: no known method is guaranteed to find the minimum in polynomial
time in the worst case, i.e. significantly faster than brute-force
enumeration (exponential time). However the typical, i.e. average, problem
has a different behavior. It undergoes a transition between a
computationally hard phase where the average error is greater than zero, and
a computationally easy phase where the error is zero [12, 13]. The same
applies in our present case. These two phases, and the transition between
them, can be studied using statistical physics
[12-14].
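
For small $N$ the unconstrained problem can be solved by brute force. The sketch below enumerates every non-empty subset and returns the one that minimizes $|\sum_i n_i a_i| / M$. It is a minimal illustration, not the authors' code: the function name `best_combination` and the bitmask enumeration are our own choices, and the run time grows as $2^N$.

```python
def best_combination(errors):
    """Return (error, indices) minimizing |sum of chosen errors| / M
    over all non-empty subsets of the components."""
    n = len(errors)
    best_error = float("inf")
    best_indices = ()
    for mask in range(1, 1 << n):  # each bit of mask plays the role of one n_i
        indices = tuple(i for i in range(n) if (mask >> i) & 1)
        error = abs(sum(errors[i] for i in indices)) / len(indices)
        if error < best_error:  # strict '<' keeps the first optimum found
            best_error, best_indices = error, indices
    return best_error, best_indices


if __name__ == "__main__":
    # Defects of mixed sign can cancel exactly: 3 - 1 - 2 = 0.
    print(best_combination([3, -1, -2, 7]))  # (0.0, (0, 1, 2))
    # Defects of the same sign cannot cancel: only the best one is used.
    print(best_combination([1.0, 2.0]))      # (1.0, (0,))
```

The second call illustrates why, when all errors share the same sign, the optimum reduces to the single most precise component.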

FIG. 1. Unconstrained problem: the average error $\langle\epsilon\rangle$
versus the total number of components $N$ obtained by numerical evaluation.
Average taken over 20000 samples. The solid line shows the behavior of the
theoretical upper bound, for a particular choice of the constant $C$.

Figure 1 reports numerical results obtained by enumerating all possible
$\{n_i\}$, where $\mu = 0$. The resulting device precision can be quite
remarkable. Our numerical simulations also confirm that
$\langle M \rangle = N/2$ for large $N$, i.e. the optimal configuration uses
half the components on average. Strong fluctuations remain even for a large
number of realizations and for large $N$, because of the non-self-averaging
nature of the problem: i.e.
$\sqrt{\langle\epsilon^2\rangle - \langle\epsilon\rangle^2}/\langle\epsilon\rangle \simeq 1$.

The division by $M$ in Eq. (1) makes the DCP much harder to tackle
analytically than the NPP. Let us compare the DCP with $\min E$, the problem
defined by finding the minimum of $E = |\sum_j n_j a_j|$. Numerical
simulations show that for the latter problem
$\langle M^* \rangle_{\min E} = \ldots$ for sufficiently large $N$, where
$M^* = \sum_{j=1}^{N} n_j^*$ is the number of selected components in the
configuration $\{n_j^*\}$ that minimizes $E$ [15]. This makes sense, since in
the DCP the division by $M$ favors configurations with a larger number of
components. In addition both problems are related by the inequality
for some constant $C$. Computing the typical properties of $\min E$ hence
yields an upper bound to the average optimal error. Following Ref. [13], we
computed the partition function $Z = \sum_{\{n_i\}} e^{-\beta E}$, which for
large $N$ yields

where $G(y) = \langle \ln(\cos(a\beta\tan(y)/2)) \rangle$. Using the saddle
point approximation for $Z$ and an argument of positive entropy [?], we find
$\langle \min_{\{n_i\}} E \rangle \sim 2^{-N}$, up to a subexponential
prefactor. Hence there is a constant $C$ such that

Figure 1 shows that the behavior of the analytically obtained upper bound for
the average error is consistent with the corresponding numerical results.
The same calculation shows that the DCP will also exhibit a phase transition
between hard and easy problems when $b < N - \ln(\pi N/6)/2$, where $b$ is
the number of bits needed to encode the $a_i$'s. Hence it is possible to
obtain perfect error-free combinations of such imperfect objects for $N$
large and $b$ sufficiently small. When the defects are biased, i.e.
$\mu \neq 0$, the error increases as $\mu$ increases but remains low for
$\mu < 1$. When the errors of the components all have the same sign, only
one component is used and the resulting error increases linearly with $\mu$.

We now consider the constrained DCP, where the number of components to be
used is pre-defined to be a particular value $M$. If $M = 1$, one selects the
least imprecise component. The case $M = N$ amounts to computing the average
over all $N$ components, hence $\langle\epsilon\rangle = \sqrt{2/(\pi N)}$.
This problem is a more complicated version of the subset sum problem: in our
case the numbers are no longer restricted to be positive and the cost
function is the absolute value of the sum. Fig. 2 plots the average and
median optimal error as a function of $M$ for $N = 10$ and 20. An
exponential fit of the minimum error for increasing $N$ gives
$\min_M \langle\epsilon\rangle \sim \exp(-KN)$ where $K = 0.56 \pm 0.01$, to
be compared with $\ln 2 = 0.693\ldots$ in the unconstrained case; the
functional form of this quantity may however be more complicated than an
exponential.
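
The constrained problem can likewise be explored by direct enumeration for small $N$. The following sketch assumes nothing beyond the definition above (exactly $M$ components must be used, and the cost is the absolute value of their average); the name `best_fixed_size` is hypothetical, and the cost of the search is $\binom{N}{M}$ evaluations.

```python
from itertools import combinations


def best_fixed_size(errors, m_used):
    """Return (error, indices) minimizing |sum of chosen errors| / M
    over all subsets containing exactly m_used components."""
    if not 1 <= m_used <= len(errors):
        raise ValueError("m_used must lie between 1 and the pool size")
    best_error = float("inf")
    best_indices = ()
    for indices in combinations(range(len(errors)), m_used):
        error = abs(sum(errors[i] for i in indices)) / m_used
        if error < best_error:
            best_error, best_indices = error, indices
    return best_error, best_indices


if __name__ == "__main__":
    pool = [3, -1, -2, 7]
    for m in range(1, len(pool) + 1):
        print(m, best_fixed_size(pool, m))
```

In this toy pool, forcing $M = 4$ leaves no freedom and gives the plain average, whereas intermediate values of $M$ allow the defects to compensate.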

We have also applied Derrida's random cost approach [14]. Given the $N$
errors $\{a_i\}$, there correspond $\binom{N}{M}$ sets $\{n_i\}$ that obey
the constraint $\sum_i n_i = M$. If the corresponding values of $E$ are
independent, all properties of the problem are then given by the p.d.f.
$p_M(E)$. In our case, the latter is straightforward to compute:

$$p_M(E) = \binom{N}{M}^{-1} {\sum_{\{n_j\}}}' \left\langle \delta\!\left(E - \Big|\sum_j n_j a_j\Big|\right) \right\rangle, \qquad (6)$$

where the prime means that $\sum_j n_j = M$. Hence $p_M(E)$ is equal to the
probability distribution of the absolute value of the sum of $M$ numbers
drawn from $P(a)$, which is

$$p_M(E) = \frac{1}{\sqrt{2\pi M}} \left[ e^{-(E - M\mu)^2/(2M)} + e^{-(E + M\mu)^2/(2M)} \right], \quad E \ge 0. \qquad (7)$$

Let us concentrate on the non-biased case $\mu = 0$ (the calculation is
easily extended to the biased case). We are interested in the average value
of the minimum $E_1$ of $L = \binom{N}{M}$ numbers drawn from $p_M(E)$.
Using $P(E_1) = -\frac{\partial}{\partial E_1} [F_>(E_1)]^L$, where
$F_>(E_1) = \mathrm{erfc}(E_1/\sqrt{2M})$ is the complementary cumulative
distribution function of $p_M(E)$, we find that

$$\langle\epsilon\rangle = \frac{\langle E_1 \rangle}{M} = \sqrt{\frac{2}{M}} \int_0^\infty dt\, [\mathrm{erfc}(t)]^{\binom{N}{M}}. \qquad (8)$$

If $M = N$, one recovers the average over all components since
$\int_0^\infty dt\, \mathrm{erfc}(t) = 1/\sqrt{\pi}$. By definition, the
median is given by $\epsilon_{\rm med}$ such that
$\int_0^{M\epsilon_{\rm med}} dE_1\, P(E_1) = 1/2$.
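
The random cost prediction can be evaluated numerically. The sketch below assumes that, for $\mu = 0$, Eq. (8) reads $\langle\epsilon\rangle = \sqrt{2/M} \int_0^\infty dt\, [\mathrm{erfc}(t)]^L$ with $L = \binom{N}{M}$; this is our reading of the formula, and it reproduces the $M = N$ limit $\sqrt{2/(\pi N)}$ because $\int_0^\infty \mathrm{erfc}(t)\, dt = 1/\sqrt{\pi}$. The function name `random_cost_error`, the Simpson rule and the cutoff at $e^{-40}$ are our own choices.

```python
import math


def random_cost_error(n_total, m_used, steps=2000):
    """Evaluate sqrt(2/M) * integral_0^inf erfc(t)**L dt, L = C(N, M),
    i.e. the random cost estimate of the average error for mu = 0."""
    n_sets = math.comb(n_total, m_used)

    def log_integrand(t):
        # log of erfc(t)**L, computed in log space to avoid underflow
        return n_sets * math.log(math.erfc(t))

    # Bisection for the point where the integrand has dropped to exp(-40);
    # the integrand decreases monotonically and is negligible beyond it.
    lo, hi = 0.0, 26.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if log_integrand(mid) > -40.0:
            lo = mid
        else:
            hi = mid
    cutoff = hi

    # Composite Simpson rule on [0, cutoff]; steps must be even.
    h = cutoff / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.exp(log_integrand(k * h))
    integral = total * h / 3.0
    return math.sqrt(2.0 / m_used) * integral


if __name__ == "__main__":
    for m in range(1, 11):
        print(m, random_cost_error(10, m))
```

Because the integrand narrows roughly as $1/L$, the integration range is adapted to $L$ rather than fixed, which keeps the same number of grid points useful for both $L = 1$ and very large $L$.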

FIG. 2. Constrained problem. Left panel: average optimal error
$\langle\epsilon\rangle$ versus the number of components $M$ for a total of
$N = 10$ (circles) and 20 (squares) (average taken over 20000 samples).
Right panel: median error versus $M$. Continuous lines are the predictions
from the random cost approach. The dashed line is the average error taken
over all components.

The left panel of Fig. 2 plots the average constrained error, obtained by
numerical enumeration, and the analytical predictions of Eq. (8), for two
sizes of component set. The larger the component pool $N$, the better the
precision. For $M \ll N$ the random cost approach describes
$\langle\epsilon\rangle$ well; however, it fails dramatically for larger
values of $M$. This is because the $L = \binom{N}{M}$ values of $E$ become
increasingly dependent as $M/N$ grows. At fixed $N$, as $M$ increases,
particular samples have an increasing probability to contain a large
fraction of defects $a_i$ with the same sign. Due to the constraint on $M$,
one may therefore be forced to use components whose defects add instead of
compensating each other. The median, which is less affected by such events,
has its minimum close to $N/2$ (right panel of Fig. 2): the random cost
approach describes the behavior of the median much better than that of the
average, although the discrepancy increases as $M/N$ increases.

FIG. 3. Binary components: average fraction $\phi$ of perfect devices as a
function of $N$ for $P = 100$ and $f = 0.2$ (filled circles), $f = 0.25$
(squares) and $f = 0.3$ (filled diamonds). Inset: running time versus $N$ of
simple enumeration (x) and enumeration with sorting (+) ($f = 0.2$).
Averages taken over 10000 samples. Dashed lines are guides to the eye.

Using a subset of defective components is also a powerful method for binary
components such as quantum-dot optical switches. Suppose each component has
$I$ input bits. If it can perform $F$ different logical operations on the
input bits, it can perform $P = F 2^I$ different transformations (i.e. its
truth table has $P$ entries). Let $f$ be the probability that for a given
transformation $l$, component $i$ systematically gives the wrong output,
e.g. because of inaccuracies in the energy-level spacings in the case of
quantum dot switches. Mathematically, $\mathrm{Prob}(a_i^l = -1) = f$, with
$-1$ labelling a defective output and $1$ a correct one. It becomes
exponentially unlikely that one can extract a perfect component as $f$
increases; however, subsets of the components may indeed produce the correct
output. One therefore selects a subset of components from a pool of $N$, in
order to maximize the number of transformations such that the majority of
components give the correct answer. The maximal fraction of working
transformations for a given set of components $\{a_i^l\}$ is

$$w = \max_{\{n_i\}} \frac{1}{P} \sum_{l=1}^{P} \Theta\!\left(\sum_{i=1}^{N} n_i a_i^l\right), \qquad (9)$$
where $\Theta(x)$ is the Heaviside function. We measured
$\phi = \langle\delta_{w,1}\rangle$, the average fraction of component sets
with at least one perfect subset. Numerical calculations confirm that it is
indeed possible to build perfectly working components even if $f$ is so high
that no single component is perfect. Figure 3 shows numerical simulations of
the probability $\phi$ versus $N$ for three values of $f$. When $\phi > 0$,
an efficient algorithm consists in first ranking (Heapsort) the components
according to the number of working transformations and then enumerating all
possibilities until a perfect combination is found, beginning with the least
defective ones (see inset of Fig. 3). Analytic
results along the lines of [11] will be presented elsewhere.
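
The ranking-then-enumeration idea can be sketched as follows. This is an illustration under our own assumptions, not the authors' implementation: each component is a list of $\pm 1$ outcomes over the $P$ transformations, a subset counts as perfect only when a strict majority is correct for every transformation, Python's built-in `sorted` stands in for Heapsort, and subsets are tried in order of increasing size. All function names are hypothetical.

```python
from itertools import combinations


def working_count(component):
    """Number of transformations a single component performs correctly."""
    return sum(1 for outcome in component if outcome == 1)


def majority_is_perfect(components, subset):
    """True if, for every transformation, the chosen components sum to a
    strictly positive value, i.e. a strict majority gives the right output."""
    n_transformations = len(components[0])
    return all(
        sum(components[i][l] for i in subset) > 0
        for l in range(n_transformations)
    )


def find_perfect_subset(components):
    """Return the indices of a perfect subset, or None if there is none."""
    # Rank components from least to most defective before enumerating.
    ranked = sorted(
        range(len(components)),
        key=lambda i: working_count(components[i]),
        reverse=True,
    )
    for size in range(1, len(ranked) + 1):
        for subset in combinations(ranked, size):
            if majority_is_perfect(components, subset):
                return tuple(sorted(subset))
    return None


if __name__ == "__main__":
    # Three components, each wrong on a different transformation:
    # no single one is perfect, but the majority vote of all three is.
    pool = [[-1, 1, 1], [1, -1, 1], [1, 1, -1]]
    print(find_perfect_subset(pool))  # (0, 1, 2)
```

In the worst case the search still visits all $2^N - 1$ subsets; the ranking only helps by reaching a perfect combination earlier when one exists.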

Admittedly, seeking optimal combinations implies an additional cost for two
reasons. First, one has to find the optimal or near-optimal combination;
this can be done either by measuring all the defects and then finding the
minimal error with a computer, or, skipping the labour-intensive step of
measuring individual defects, by building combinations of objects such that
we eventually minimize the aggregate error. Second, these objects have to be
wired, and their output combined by an additional, hopefully error-free,
device. However, such wiring and selection of working subsets of components
is precisely what is already being done inside the Teramac [3]. Given that
defective components can be cheaply produced en masse, the cost of such
wiring and selection of working combinations may not be an obstacle. Hence
we believe that our two optimization problems may prove relevant in
practice, in particular in emerging technologies where the fabrication of
defect-free components may not be possible.

Our scheme implies that the 'quality' of a component is not determined
solely by its own intrinsic error. Instead error becomes a collective
property, which is determined by the 'environment' corresponding to the
other defective components. Efficient recycling of otherwise 'useless'
components now becomes possible. Suppose that a fabrication process produces
a constant flow of defective analog or binary components. One can now
perform the following scheme to generate a continuous output of useful
devices: fix $N$ according to the desired average error (see Fig. 1, Fig. 2
or Fig. 3); form the optimal subset; add fresh components to the unused
ones; find the optimal subset, and repeat as desired. The quality of the
subset fluctuates, but there is essentially no wastage. Although efficient
algorithms for the analog case remain to be found, generalization of
well-known algorithms [?] may be possible. We hope that our work inspires
further academic research into this important practical problem.
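
The recycling loop described above can be simulated for analog components. The sketch below is a toy model under our own assumptions: unbiased unit-variance Gaussian defects, an exhaustive search for the optimal subset (practical only for small pools), a fixed random seed, and hypothetical function names `best_subset` and `recycle`.

```python
import random


def best_subset(pool):
    """Exhaustively find the non-empty subset with the smallest
    absolute average error; returns (error, indices)."""
    n = len(pool)
    best_error, best_indices = float("inf"), ()
    for mask in range(1, 1 << n):
        indices = tuple(i for i in range(n) if (mask >> i) & 1)
        error = abs(sum(pool[i] for i in indices)) / len(indices)
        if error < best_error:
            best_error, best_indices = error, indices
    return best_error, best_indices


def recycle(pool_size, n_rounds, seed=0):
    """Repeatedly build one device from the pool, then top the pool up
    with freshly fabricated components so that its size stays fixed."""
    rng = random.Random(seed)
    pool = [rng.gauss(0.0, 1.0) for _ in range(pool_size)]
    device_errors = []
    for _ in range(n_rounds):
        error, chosen = best_subset(pool)
        device_errors.append(error)
        chosen_set = set(chosen)
        # Keep the unused components and add fresh ones to replace the rest.
        pool = [a for i, a in enumerate(pool) if i not in chosen_set]
        pool.extend(rng.gauss(0.0, 1.0) for _ in range(pool_size - len(pool)))
    return device_errors, pool


if __name__ == "__main__":
    errors, leftover = recycle(pool_size=10, n_rounds=20)
    print(errors)
    print(len(leftover))
```

Every component either ends up in a device or stays in the pool for a later round, which is the sense in which the scheme produces essentially no wastage.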

We thank R. Zecchina and D. Sherrington for useful 
Scientific Research for financial support.